NSF PAR Search | NSF Public Access Repository

A Tale of Two Regulatory Regimes: Creation and Analysis of a Bilingual Privacy Policy Corpus

Arora, Siddhant; Hosseini, Henry; Utz, Christine; Bannihatti Kumar, Vinayshekhar; Dhellemmes, Tristan; Ravichander, Abhilasha; Story, Peter; Mangat, Jasmine; Chen, Rex; Degeling, Martin; et al (May 2022, LREC proceedings)

Over the past decade, researchers have started to explore the use of NLP to develop tools aimed at helping the public, vendors, and regulators analyze disclosures made in privacy policies. With the introduction of new privacy regulations, the language of privacy policies is also evolving, and disclosures made by the same organization are not always the same in different languages, especially when used to communicate with users who fall under different jurisdictions. This work explores the use of language technologies to capture and analyze these differences at scale. We introduce an annotation scheme designed to capture the nuances of two new landmark privacy regulations, namely the EU’s GDPR and California’s CCPA/CPRA. We then introduce the first bilingual corpus of mobile app privacy policies, consisting of 64 privacy policies in English (292K words) and 91 privacy policies in German (478K words), with manual annotations for 8K and 19K fine-grained data practices, respectively. The annotations are used to develop computational methods that can automatically extract “disclosures” from privacy policies. Analysis of a subset of 59 “semi-parallel” policies reveals differences that can be attributed to different regulatory regimes, suggesting that systematic analysis of policies using automated language technologies is indeed a worthwhile endeavor.
Full Text Available
Hey Alexa, is this Skill Safe?: Taking a Closer Look at the Alexa Skill Ecosystem

https://doi.org/10.14722/ndss.2021.23111

Lentzsch, Christopher; Shah, Sheel Jayesh; Andow, Benjamin; Degeling, Martin; Das, Anupam; Enck, William (February 2021, Network and Distributed System Security (NDSS) Symposium 2021)
Amazon's voice-based assistant, Alexa, enables users to directly interact with various web services through natural language dialogues. It provides developers with the option to create third-party applications (known as Skills) to run on top of Alexa. While such applications ease users' interaction with smart devices and bolster a number of additional services, they also raise security and privacy concerns due to the personal setting they operate in. This paper aims to perform a systematic analysis of the Alexa skill ecosystem. We perform the first large-scale analysis of Alexa skills, obtained from seven different skill stores and totaling 90,194 unique skills. Our analysis reveals several limitations that exist in the current skill vetting process. We show that not only can a malicious user publish a skill under any arbitrary developer/company name, but she can also make backend code changes after approval to coax users into revealing unwanted information. Next, we formalize the different skill-squatting techniques and evaluate the efficacy of such techniques. We find that while certain approaches are more favorable than others, there is no substantial abuse of skill squatting in the real world. Lastly, we study the prevalence of privacy policies across different categories of skills and, more importantly, the policy content of skills that use the Alexa permission model to access sensitive user data. We find that around 23.3% of such skills do not fully disclose the data types associated with the permissions requested. We conclude by providing some suggestions for strengthening the overall ecosystem, thereby enhancing transparency for end-users.
Full Text Available
A Tale of Two Regulatory Regimes: Creation and Analysis of a Bilingual Privacy Policy Corpus

Arora, Siddhant; Hosseini, Henry; Utz, Christine; Bannihatti, Vinayshekhar K.; Dhellemmes, Tristan; Ravichander, Abhilasha; Story, Peter; Mangat, Jasmine; Chen, Rex; Degeling, Martin; et al (January 2022, LREC proceedings)

Full Text Available
Personalized Privacy Assistants for the Internet of Things: Providing Users with Notice and Choice

https://doi.org/10.1109/MPRV.2018.03367733

Das, Anupam; Degeling, Martin; Smullen, Daniel; Sadeh, Norman (July 2018, IEEE Pervasive Computing)

Full Text Available
